1.
Journal of Business & Economic Statistics ; 41(3):846-861, 2023.
Article in English | ProQuest Central | ID: covidwho-20245136

ABSTRACT

This article studies multiple structural breaks in large contemporaneous covariance matrices of high-dimensional time series satisfying an approximate factor model. The breaks in the second-order moment structure of the common components are due to sudden changes in either factor loadings or covariance of latent factors, requiring appropriate transformation of the factor models to facilitate estimation of the (transformed) common factors and factor loadings via the classical principal component analysis. With the estimated factors and idiosyncratic errors, an easy-to-implement CUSUM-based detection technique is introduced to consistently estimate the location and number of breaks and correctly identify whether they originate in the common or idiosyncratic error components. The algorithms of Wild Binary Segmentation for Covariance (WBS-Cov) and Wild Sparsified Binary Segmentation for Covariance (WSBS-Cov) are used to estimate breaks in the common and idiosyncratic error components, respectively. Under some technical conditions, the asymptotic properties of the proposed methodology are derived with near-optimal rates (up to a logarithmic factor) achieved for the estimated breaks. Monte Carlo simulation studies are conducted to examine the finite-sample performance of the developed method and its comparison with other existing approaches. We finally apply our method to study the contemporaneous covariance structure of daily returns of S&P 500 constituents and identify a few breaks including those occurring during the 2007–2008 financial crisis and the recent coronavirus (COVID-19) outbreak. A software package is provided to implement the proposed algorithms.
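The CUSUM idea mentioned in this abstract can be illustrated with a minimal single-break sketch. This is not the WBS-Cov or WSBS-Cov algorithm from the article; it is a plain univariate CUSUM scan applied to a squared series, and the function names `cusum_statistic` and `estimate_single_break` as well as the simulated data are assumptions made for illustration only.

```python
import numpy as np

def cusum_statistic(x):
    """Absolute CUSUM value for every candidate break position 1..n-1."""
    x = np.asarray(x, dtype=float)
    n = x.size
    b = np.arange(1, n)            # candidate break positions
    left = np.cumsum(x)[:-1]       # sum of x[0:b]
    right = x.sum() - left         # sum of x[b:n]
    # Weighted difference between the segment sums before and after b.
    return np.abs(np.sqrt((n - b) / (n * b)) * left
                  - np.sqrt(b / (n * (n - b))) * right)

def estimate_single_break(x):
    """Return the break position maximising the CUSUM and the maximum itself."""
    stats = cusum_statistic(x)
    b_hat = int(np.argmax(stats)) + 1
    return b_hat, float(stats[b_hat - 1])

# Hypothetical data: the variance changes after observation 300.
rng = np.random.default_rng(0)
series = np.concatenate([rng.normal(0, 1, 300), rng.normal(0, 3, 200)])
# A change in variance is a change in the mean of the squared series.
b_hat, stat = estimate_single_break(series ** 2)
print(b_hat, round(stat, 2))
```

Scanning the squared series turns a second-moment break into a mean break, which is the simplest scalar analogue of looking for breaks in a covariance structure.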

2.
Fractals ; : 1, 2023.
Article in English | Academic Search Complete | ID: covidwho-20242709

ABSTRACT

This paper investigates the extent and speed of the spread of the coronavirus disease 2019 (COVID-19) pandemic in the United States (US). For this purpose, the fractional form of the susceptible-exposed-infected-recovered-vaccinated-quarantined-hospitalized-social distancing (SEIR-VQHP) model is initially developed, considering the effects of social distancing, quarantine, hospitalization, and vaccination. Then, a Monte Carlo-based back analysis method is proposed by defining the model parameters, viz. the effects of social distancing rate (α), infection rate (β), vaccination rate (ρ), average latency period (γ), infection-to-quarantine rate (δ), time-dependent recovery rate (λ), time-dependent mortality rate (κ), hospitalization rate (ξ), hospitalization-to-recovery rate (ψ), hospitalization-to-mortality rate (ϕ), and the fractional degree of differential equations as random variables, to obtain the optimal parameters and provide the best combination of fractional order so as to give the best possible fit to the data selected between January 20, 2020 and February 10, 2021. The results demonstrate that the number of infected, recovered, and dead cases by the end of 2021 will reach 1.0, 49.8, and 0.7 million, respectively. Moreover, histograms of the fractional order acquired from back analysis are provided that can be utilized in similar fractional analyses as an informed initial suggestion. Furthermore, a sensitivity analysis is provided to investigate the effect of vaccination and social distancing on the number of infected cases. The results show that if social distancing increases by 25% and the vaccination rate doubles, the number of infected cases will drop to 0.13 million by early 2022, indicating relative pandemic control in the US.

3.
Journal of Computational and Graphical Statistics ; 32(2):483-500, 2023.
Article in English | ProQuest Central | ID: covidwho-20241312

ABSTRACT

In this article, a multivariate count distribution with Conway-Maxwell (COM)-Poisson marginals is proposed. To do this, we develop a modification of the Sarmanov method for constructing multivariate distributions. Our multivariate COM-Poisson (MultCOMP) model has desirable features such as (i) it admits a flexible covariance matrix allowing for both negative and positive nondiagonal entries; (ii) it overcomes the limitation of the existing bivariate COM-Poisson distributions in the literature that do not have COM-Poisson marginals; (iii) it allows for the analysis of multivariate counts and is not just limited to bivariate counts. Inferential challenges are presented by the likelihood specification as it depends on a number of intractable normalizing constants involving the model parameters. These obstacles motivate us to propose Bayesian inferential approaches where the resulting doubly intractable posterior is handled via the noisy exchange algorithm or the Grouped Independence Metropolis–Hastings algorithm. Numerical experiments based on simulations are presented to illustrate the proposed Bayesian approach. We demonstrate the potential of the MultCOMP model through a real data application on the numbers of goals scored by the home and away teams in the English Premier League from 2018 to 2021. Here, our interest is to assess the effect of a lack of crowds during the COVID-19 pandemic on the well-known home team advantage. A MultCOMP model fit shows that there is evidence of a decreased number of goals scored by the home team, not accompanied by a reduced score from the opponent. Hence, our analysis suggests a smaller home team advantage in the absence of crowds, which agrees with the opinion of several football experts. Supplementary materials for this article are available online.

4.
Epidemic Analytics for Decision Supports in COVID19 Crisis ; : 1-158, 2022.
Article in English | Scopus | ID: covidwho-20238851

ABSTRACT

Covid-19 hit the world unprepared, as the deadliest pandemic of the century. Governments and authorities, as leaders and decision makers fighting the virus, have drawn heavily on the power of AI and its data analytics models for urgent decision support, with an effort unprecedented in human history. This book showcases a collection of important data analytics models that were used during the epidemic, and discusses and compares their efficacy and limitations. Readers from both the healthcare industry and academia can gain unique insights into how data analytics models were designed and applied to epidemic data. Taking Covid-19 as a case study, readers, especially those working in similar fields, will be better prepared should a new epidemic wave arise in the near future. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022.

5.
Revista Mexicana de Economia y Finanzas Nueva Epoca ; 18(2), 2023.
Article in Spanish | Scopus | ID: covidwho-20237508

ABSTRACT

Many sectors of the economy, particularly the insurance sector, were negatively affected by the appearance of COVID-19. With the support of governments or reinsurers through the payment of a premium, insurance companies could receive a contingent resource in the face of excess infections caused by the pandemic. This paper calculates the premium to cover the excess affected population with a financial options model with a diffusion process without and with Poisson jumps and the Susceptible-Infected-Recovered (SIR) epidemiological model (this estimation is original). The obtained system is approximated with the Monte Carlo simulation method. The results show that there are important differences in the option premiums when Poisson jumps are included. Lastly, it is highlighted that the premium depends on the trajectory of contagions and the strike contagion value (K). This work has a limitation when applied to very particular cases, but a calibration of the parameters with more real information could be done in future research. © 2023 Russell Sage Foundation. Lewis-McCoy, R. L'Heureux, Natasha Warikoo, Stephen A. Matthews, and Nadirah Farah Foley. 2023.

6.
Epidemic Analytics for Decision Supports in COVID19 Crisis ; : 83-102, 2022.
Article in English | Scopus | ID: covidwho-20237299

ABSTRACT

There are several techniques to support simulation of time series behavior. In this chapter, the approach will be based on the Composite Monte Carlo (CMC) simulation method. This method is able to model future outcomes of time series under analysis from the available data. The establishment of multiple correlations and causality between the data allows modeling the variables and probabilistic distributions and subsequently obtaining also probabilistic results for time series forecasting. To improve the predictor efficiency, computational intelligence techniques are proposed, including a fuzzy inference system and an Artificial Neural Network architecture. This type of model is suitable to be considered not only for the disease monitoring and compartmental classes, but also for managerial data such as clinical resources, medical and health team allocation, and bed management, which are data related to complex decision-making challenges. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2022.

7.
IOP Conference Series Earth and Environmental Science ; 1166(1):012040, 2023.
Article in English | ProQuest Central | ID: covidwho-20234746

ABSTRACT

In the maritime industry, the unanticipated COVID-19 viral epidemic was an unforeseeable circumstance, and nations implemented enormous containment measures to stop the coronavirus epidemic from spreading around the world, directly affecting the maritime shipping sector. This paper discusses the current problems facing the shipping industry, taking into account congestion, delays, and uncertain timeframes, using the Los Angeles port as a case study. These problems and more were addressed directly by increasing the operating hours, workload, and available staff, and indirectly by looking for alternatives for shipping goods and creating more cargo space. Furthermore, this study uses Monte Carlo simulation to predict the effectiveness of these solutions on the congestion at the port.

8.
Indian Journal of Medical Microbiology ; 45 (no pagination), 2023.
Article in English | EMBASE | ID: covidwho-20232484

ABSTRACT

Purpose: Compared to nasopharyngeal/oropharyngeal swabs (N/OPS-VTM), non-invasive saliva samples have enormous potential for scalability and routine population screening of SARS-CoV-2. In this study, we investigate the efficacy of saliva samples relative to N/OPS-VTM for use as a direct source for RT-PCR based SARS-CoV-2 detection. Method(s): We collected paired nasopharyngeal/oropharyngeal swabs and saliva samples from suspected positive SARS-CoV-2 patients and tested using RT-PCR. We used generalized linear models to investigate factors that explain result agreement. Further, we used simulations to evaluate the effectiveness of saliva-based screening in restricting the spread of infection in a large campus such as an educational institution. Result(s): We observed a 75.4% agreement between saliva and N/OPS-VTM, that increased drastically to 83% in samples stored for less than three days. Such samples processed within two days of collection showed 74.5% test sensitivity. Our simulations suggest that a test with 75% sensitivity, but high daily capacity can be very effective in limiting the size of infection clusters in a workspace. Guided by these results, we successfully implemented a saliva-based screening in the Bangalore Life Sciences Cluster (BLiSC) campus. Conclusion(s): These results suggest that saliva may be a viable alternate source for SARS-CoV-2 surveillance if samples are processed immediately. Although saliva shows slightly lower sensitivity levels when compared to N/OPS-VTM, saliva collection is logistically advantageous. We strongly recommend the implementation of saliva-based screening strategies for large workplaces and in schools, as well as for population-level screening and routine surveillance as we learn to live with the SARS-CoV-2 virus. Copyright © 2023 Indian Association of Medical Microbiologists

9.
Indoor Air ; : 1-24, 2023.
Article in English | Academic Search Complete | ID: covidwho-20232043

ABSTRACT

The COVID-19 pandemic outbreak has increased the general awareness of the importance of proper ventilation in the indoor environment to reduce the contagion risk. In particular, attention has been paid to specific categories of buildings, such as schools, due to two factors: (1) high occupancy density and (2) the presence of young and sometimes more susceptible people. Despite the high level of alertness towards the ventilation of classrooms, robust analyses of the effectiveness of the different strategies to mitigate the contagion risk have been difficult to perform. Indeed, the COVID-19 pandemic is still ongoing, and many factors, such as the presence of multiple viral strains, use of facial masks, progression in vaccination, and installation of air purifiers and other sanitization devices, make it difficult to fully quantify the impact of room ventilation by simply analysing available monitoring data. Moreover, mitigation strategies related to ventilation are often dynamic, increasing the complexity of the problem to assess. In this framework, this work proposes a new Monte Carlo method integrated with building performance simulation to evaluate the number of infected occupants under different scenarios, considering also the dynamic boundary conditions. The described approach has been applied to a case study classroom at the Free University of Bozen-Bolzano, Italy, analysing almost 100 different scenarios and discussing the effectiveness of different ventilation strategies traditionally adopted to ensure suitable IAQ according to CO2 concentration limits. Results highlight the importance of combining different solutions (e.g., mixed-mode ventilation and facial masks) to limit the risk for both students and lecturers.

10.
J Appl Stat ; 50(8): 1853-1875, 2023.
Article in English | MEDLINE | ID: covidwho-20241422

ABSTRACT

In this paper, reparameterization and the Student-t distribution are applied to the Stochastic Volatility (SV) model. We aim to reduce the amount of autocorrelation of the SV parameters and to introduce a heavy-tailed model via the Bayesian computation of Markov Chain Monte Carlo (MCMC) samplers. This research supports better MCMC estimation of the SV model for volatile Asian FX series during Covid-19.

11.
Decision Making: Applications in Management and Engineering ; 6(1):219-239, 2023.
Article in English | Scopus | ID: covidwho-2322042

ABSTRACT

The overall purpose of this paper is to define a new metric on the spreadability of a disease. Herein, we define a variant of the well-known graph-theoretic burning number (BN) metric that we coin the contagion number (CN). We aver that the CN is a better metric to model disease spread than the BN as the CN concentrates on first-time infections. This is important because the Centers for Disease Control and Prevention report that COVID-19 reinfections are rare. This paper delineates a novel methodology to solve for the CN of any tree, in polynomial time, which addresses how fast a disease could spread (i.e., a worst-case analysis). We then employ Monte Carlo simulation to determine the average contagion number (ACN) (i.e., a most-likely analysis) of how fast a disease would spread. The latter is analyzed on scale-free graphs, which are specifically designed to model human social networks (sociograms). We test our method on some randomly generated scale-free graphs and our findings indicate the CN to be a robust, tractable (the BN is NP-hard even for a tree), and effective disease spread metric for decision makers. The contributions herein advance disease spread understanding and reveal the importance of the underlying network structure. Understanding disease spreadability informs public policy and the associated managerial allocation decisions. © 2023 by the authors.

12.
2nd International Conference on Biological Engineering and Medical Science, ICBioMed 2022 ; 12611, 2023.
Article in English | Scopus | ID: covidwho-2327141

ABSTRACT

This paper analyzes the influence of different disease control and prevention strategies on the outbreak of COVID-19 in a susceptible-infected-quarantined-recovered-died (SIQDR) model. It builds a continuous dynamical system model and a discrete Monte Carlo model to simulate the spread of COVID-19, and discusses how different control and prevention policies affect that spread. In addition, it measures the impact of different policies on the economy to help the government choose a more appropriate policy. © 2023 SPIE.
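A discrete Monte Carlo compartment model of the kind described in this abstract can be sketched as a daily chain-binomial simulation. The abstract gives no equations or parameter values, so everything below is an assumption for illustration: the function name `simulate_siqdr`, the transition structure (infected people may be quarantined, recover, or die; quarantined people may recover or die and do not mix), and all rates.

```python
import numpy as np

def simulate_siqdr(n=10_000, i0=10, beta=0.3, quarantine=0.10,
                   recover=0.05, die=0.005, days=200, seed=0):
    """Simulate daily counts of (S, I, Q, R, D); all rates are hypothetical."""
    rng = np.random.default_rng(seed)
    s, i, q, r, d = n - i0, i0, 0, 0, 0
    history = [(s, i, q, r, d)]
    for _day in range(days):
        mixing = s + i + r  # assumption: quarantined people do not mix
        p_inf = 1.0 - np.exp(-beta * i / mixing) if mixing > 0 else 0.0
        new_inf = int(rng.binomial(s, p_inf))
        # Each infected person is quarantined, recovers, dies, or stays infected.
        i_q, i_r, i_d, _stay = rng.multinomial(
            i, [quarantine, recover, die, 1.0 - quarantine - recover - die])
        # Each quarantined person recovers, dies, or stays quarantined.
        q_r, q_d, _stay = rng.multinomial(
            q, [recover, die, 1.0 - recover - die])
        s -= new_inf
        i += new_inf - int(i_q) - int(i_r) - int(i_d)
        q += int(i_q) - int(q_r) - int(q_d)
        r += int(i_r) + int(q_r)
        d += int(i_d) + int(q_d)
        history.append((s, i, q, r, d))
    return np.array(history)

# Compare a weak and a strong quarantine policy (both hypothetical).
weak = simulate_siqdr(quarantine=0.05)
strong = simulate_siqdr(quarantine=0.30)
print("deaths, weak policy:", weak[-1, 4], "strong policy:", strong[-1, 4])
```

Repeating the run over many seeds and averaging the final columns would give Monte Carlo estimates of each policy's outcome.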

13.
Heliyon ; 9(6): e16358, 2023 Jun.
Article in English | MEDLINE | ID: covidwho-2327303

ABSTRACT

The expectation in global demand for liquified natural gas (LNG) remains bullish in the coming years. Despite the unprecedented impact of the COVID-19 pandemic, and the oil price wars between OPEC and Russia in 2020, causing oversupply and falling prices, the LNG markets continue to demonstrate flexibility and resilience in delivering the needs of different sectors, whilst helping achieve the emissions targets. This is attributed to the high competitiveness amongst LNG producers and suppliers, providing greater confidence for medium-to-long term demand. However, the uncertainties in the current outlook for the return of demand and price growth in the post-COVID period pose difficulty for new liquefaction project investment decisions in the pre-Investment Decision Phase (pre-FID). Accordingly, the consideration of new production and selling strategies is needed in the early design stages of projects to cope with the shift in buyers' sentiments favouring increased reliance on spot and short-term uncontracted volumes, as well as incorporating additional flexibility into long-term contracts. In this study, the economic valuation of the flexible Air Product's AP-X liquefaction technology was investigated considering the modelling of price volatilities, using the mean-reverting jump-diffusion pricing model and Monte Carlo simulation, assuming different demand level scenarios in the high-income Asia Pacific markets based on historical trends. The results clearly demonstrate that embedding flexibility within an LNG production system allows producers and suppliers to diversify selling strategies, and take advantage of the lucrative market conditions when demand and prices increase, and hedge against market risks when demand and prices are low.
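The pricing approach named in this abstract, a mean-reverting jump-diffusion simulated by Monte Carlo, can be sketched as follows. The abstract does not state the model equations or calibration, so this is an assumed form: an Euler discretisation of the log-price with mean reversion toward `mu`, Gaussian diffusion, and Poisson-arriving normal jumps. The function name `simulate_mrjd` and every parameter value are hypothetical.

```python
import numpy as np

def simulate_mrjd(p0, mu, kappa, sigma, jump_rate, jump_mean, jump_std,
                  horizon_years, steps, n_paths, seed=0):
    """Simulate price paths; returns an array of shape (steps + 1, n_paths)."""
    rng = np.random.default_rng(seed)
    dt = horizon_years / steps
    log_p = np.full(n_paths, np.log(p0))
    paths = np.empty((steps + 1, n_paths))
    paths[0] = p0
    for t in range(1, steps + 1):
        diffusion = sigma * np.sqrt(dt) * rng.standard_normal(n_paths)
        n_jumps = rng.poisson(jump_rate * dt, n_paths)
        # The sum of k independent N(m, s^2) jumps is N(k*m, k*s^2).
        jumps = (n_jumps * jump_mean
                 + np.sqrt(n_jumps) * jump_std * rng.standard_normal(n_paths))
        # Mean reversion pulls the log-price toward mu at speed kappa.
        log_p = log_p + kappa * (mu - log_p) * dt + diffusion + jumps
        paths[t] = np.exp(log_p)
    return paths

# Hypothetical spot price of 8 per unit, reverting to a long-run level of 10.
paths = simulate_mrjd(p0=8.0, mu=np.log(10.0), kappa=1.5, sigma=0.4,
                      jump_rate=2.0, jump_mean=0.0, jump_std=0.25,
                      horizon_years=5.0, steps=60, n_paths=5_000)
print("mean terminal price:", paths[-1].mean())
```

A flexibility valuation would then evaluate each selling strategy's cash flows along every simulated path and average the discounted results.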

14.
Sustainability ; 15(9):7381, 2023.
Article in English | ProQuest Central | ID: covidwho-2320934

ABSTRACT

The transportation industry is characterized as a capital-intensive industry that plays a crucial role in economic and social development, and the rapid expansion of this industry has led to serious environmental problems, which makes the eco-efficiency analysis of the transportation industry an important issue. Previous research paid little attention to the regulatory scenarios and suffered from the incomparability problem, hence this paper aims to reasonably estimate the eco-efficiency and identify its evolutionary characteristics. We measure the eco-efficiency and the corresponding global Malmquist–Luenberger productivity index using a modified model of the data envelopment analysis framework, in which different regulatory constraints are incorporated. Based on the empirical study on the transportation industry of thirty provinces in China, we find that the eco-efficiency of the Chinese transportation industry experienced a slight increase during 2015–2016, a sharp decline during 2016–2017, and a continuous rise since 2017. The Middle Yangtze River area was the best performer among the eight regions in terms of eco-efficiency, while the Southwest area was placed last. The global Malmquist–Luenberger productivity index showed an earlier increase and later decrease trend, which was quite consistent with the reality of the variation of inputs and outputs and the emergence of COVID-19. Moreover, the best practice gap change was found to be the main driving force of productivity. The empirical results verify the practicability of our measurement models and the conclusions can be adopted in guiding the formulation of corresponding policies and regulations.

15.
Topics in Antiviral Medicine ; 31(2):441, 2023.
Article in English | EMBASE | ID: covidwho-2320431

ABSTRACT

Background: A need exists for safe, affordable, and effective antiviral treatments for less severe COVID-19 outpatients that can prevent infection progression, hospitalization, and death; shorten the time to clinical recovery; and reduce transmission. To the best of our knowledge, there is so far no cost-effectiveness analysis of oral antiviral COVID-19 drugs in Spain. In our study we aim to evaluate the cost-effectiveness of oral nirmatrelvir plus ritonavir in mild-to-moderate COVID-19 outpatients with at least one risk factor for disease progression in Spain. Method(s): A simulation model was constructed in R to assess the clinical consequences and costs associated with COVID-19 in a hypothetical cohort of non-hospitalized patients older than 65 years with mild-to-moderate COVID and at least one risk factor for progression in Spain. The intervention assessed was nirmatrelvir plus ritonavir, 300 mg plus 100 mg every 12 hours for up to 5 days. The comparator was symptomatic treatment with no antiviral drugs against SARS-CoV-2. The study was contextualized in the Spanish National Health System and the perspective of the service provider was adopted. Quality-adjusted life years (QALYs) were used as a measure of effectiveness. Drug effectiveness was obtained from a literature review. As a cost measure, the retail price of the drugs was used. As the willingness-to-pay threshold, the Spanish Gross National Product per capita was used. A discount of 3% per year was applied on future health effects. We used a decision tree model. A univariate sensitivity analysis and a probabilistic sensitivity analysis were performed. Result(s): We found that nirmatrelvir/ritonavir yielded an extra 620.89 QALYs compared to a baseline scenario without it, at an increase in cost of 89,630,442, with an incremental cost-effectiveness ratio of 144,356.4 per QALY gained. One-way sensitivity analysis and probabilistic sensitivity analysis using Monte Carlo simulations were undertaken and showed that the probability of not being cost-effective was 1 at the current price and willingness-to-pay threshold. To meet our willingness-to-pay threshold, the price of the 5-day nirmatrelvir plus ritonavir treatment should be lowered to 70. Conclusion(s): According to our analysis, nirmatrelvir/ritonavir is not cost-effective in the Spanish National Health System for outpatients older than 65 years with at least one risk factor for COVID progression. A drug price of 70 per treatment would meet our willingness-to-pay threshold.
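The incremental cost-effectiveness ratio reported in this abstract is the extra cost divided by the extra QALYs. The short sketch below reproduces that arithmetic from the two figures given in the text; the function name `icer` is an assumption, and the currency unit is left out because the text does not state it.

```python
def icer(delta_cost, delta_qaly):
    """Incremental cost-effectiveness ratio: extra cost per extra QALY."""
    if delta_qaly == 0:
        raise ValueError("ICER is undefined when the QALY difference is zero")
    return delta_cost / delta_qaly

# Figures taken from the abstract: incremental cost and incremental QALYs.
ratio = icer(89_630_442, 620.89)
print(f"ICER: {ratio:,.0f} per QALY gained")
```

The result lands within a few units of the 144,356.4 per QALY quoted in the abstract; the small gap is presumably due to rounding of the published inputs.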

16.
Computational & Applied Mathematics ; 42(4), 2023.
Article in English | ProQuest Central | ID: covidwho-2319325

ABSTRACT

Mark–recapture sampling schemes are conventional approaches for population size (N) estimation. In this paper, we mainly focus on providing fixed-length confidence interval estimation methodologies for N under a mark–recapture–mark sampling scheme, where, during the resampling phase, non-marked items are marked before they are released back into the population. Using a Monte Carlo method, the interval estimates for N are obtained through a purely sequential procedure with an adaptive stopping rule. Such an adaptive decision criterion enables the user to "learn" with the subsequent marked and newly tagged items. The method is then compared with a recently developed accelerated sequential procedure in terms of coverage probability and expected number of captured items during the resampling stage. To illustrate, we explain how the proposed procedure could be applied to estimate the number of infected COVID-19 individuals in a near-closed population. In addition, we present a numeric application inspired by the problem of estimating the population size of endangered monkeys of the Atlantic forest in Brazil.

17.
International Journal of Wine Business Research ; 35(2):256-277, 2023.
Article in English | ProQuest Central | ID: covidwho-2318845

ABSTRACT

Purpose: This paper aims to formulate a hedonic pricing model for Japanese rice wine, sake, via hierarchical Bayesian modeling estimated using an efficient Markov chain Monte Carlo (MCMC) method. Using the estimated model, the authors examine how producing regions, rice breeds and taste characteristics affect sake prices. Design/methodology/approach: The datasets in the estimation consist of cross-sectional observations of 403 sake brands, which include sake prices, taste indicators, premium categories, rice breeds and regional dummy variables. Data were retrieved from Rakuten, Japan's largest online shopping site. The authors used the Bayesian estimation of the hedonic pricing model and used an ancillarity–sufficiency interweaving strategy to improve the sampling efficiency of MCMC. Findings: The estimation results indicate that Japanese consumers value sweeter sake more, and the price of sake reflects the cost of rice preprocessing only for the most-expensive category of sake. No distinctive differences were identified among rice breeds or producing regions in the hedonic pricing model. Originality/value: To the best of the authors' knowledge, this study is the first to estimate a hedonic pricing model of sake, despite the rich literature on alcoholic beverages. The findings may contribute new insights into consumer preference and proper pricing for sake breweries and distributors venturing into the e-commerce market.

18.
Pharmaceutical Sciences Asia ; 50(1):9-16, 2023.
Article in English | EMBASE | ID: covidwho-2317731

ABSTRACT

Information on the pharmacokinetic (PK) drug-drug interactions (DDIs) of the nelfinavir and cepharanthine combination in humans is limited. In addition, no dosage regimen of this combination is available for COVID-19 treatment. The objective of this study was to perform in silico simulations using GastroPlus™ software to predict physicochemical properties and PK parameters using the physiologically based pharmacokinetic (PBPK) model of healthy adults under different dosage regimens. The DDI analysis of the nelfinavir and cepharanthine combination was carried out to optimize the dosage regimens as a potential treatment against COVID-19. The Spatial Data File (SDF) format of the nelfinavir and cepharanthine structures obtained from the PubChem database was used to carry out in silico predictions of physicochemical properties and PK parameters using several modules, such as ADMET Predictor, Metabolism and Transporter, and the PBPK model. Subsequently, all data were utilized in the DDI simulations. The dynamic simulation feature was selected to calculate and investigate the Cmax, AUC0-120, AUC0-inf, Cmax ratio, AUC0-120 ratio, and AUC0-inf ratio. For the victim drug, nelfinavir, four oral regimens were simulated: 500 mg and 750 mg, each every 8 or 12 hours. For the perpetrator drug, cepharanthine, oral regimens from 10 mg to 120 mg every 8, 12, or 24 hours were simulated. From all predicted results, the potential combination dosage regimen against COVID-19 was nelfinavir 500 mg every 8 hours and cepharanthine 10 mg every 12 hours. Copyright © 2023 by Faculty of Pharmacy, Mahidol University, Thailand is licensed under CC BY-NC-ND 4.0. To view a copy of this license, visit https://www.creativecommons.org/licenses/by-nc-nd/4.0/.

19.
Heliyon ; 9(5): e15850, 2023 May.
Article in English | MEDLINE | ID: covidwho-2313837

ABSTRACT

This paper estimates the impact of the Covid-19 pandemic on the economic and financial performance of the Portuguese mainland hotel industry. For that purpose, we implement a novel empirical approach to gauge the impact of the pandemic during the 2020-2021 period in terms of the industry's aggregated operating revenues, net total assets, net total debt, generated cash flow, and financial slack. To that end, we derive and estimate a sustainable growth model to project the 2020 and 2021 'Covid-free' aggregated financial statements of a representative Portuguese mainland hotel industry sample. The impact of the Covid pandemic is measured by the difference between the 'Covid-free' financial statements and the historical data drawn from the Orbis and Sabi databases. An MC simulation with bootstrapping indicates that the deviations of the deterministic from the stochastic estimates for major indicators vary between 0.5 and 5.5%. The deterministic operating cash flow estimate lies within plus or minus two standard deviations from the mean interval of the operating cash flow distribution. Based on this distribution, we estimate the downside risk, measured by cash flow at risk, at 1294 million euros. Overall findings shed some light on the economic and financial repercussions of extreme events such as the Covid-19 pandemic, providing us with a better understanding of how to design public policies and business strategies to recover from such an impact.

20.
Bioinformatics Research and Applications, Isbra 2022 ; 13760:369-380, 2022.
Article in English | Web of Science | ID: covidwho-2309148

ABSTRACT

Clustering viral sequences allows us to characterize the composition and structure of intrahost and interhost viral populations, which play a crucial role in disease progression and epidemic spread. In this paper we propose and validate a new entropy based method for clustering aligned viral sequences considered as categorical data. The method finds a homogeneous clustering by minimizing information entropy rather than distance between sequences in the same cluster. We have applied our entropy based clustering method to SARS-CoV-2 viral sequencing data. We report the information content extracted from the sequences by entropy based clustering. Our method converges to similar minimum-entropy clusterings across different runs and limited permutations of data. We also show that a parallelized version of our tool is scalable to very large SARS-CoV-2 datasets.
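The objective described in this abstract, preferring clusterings whose members are homogeneous in an information-entropy sense, can be illustrated with a small scoring sketch. The abstract does not give the exact objective, so the size-weighted sum of per-column Shannon entropies used here is an assumption, as are the function names and the toy sequences.

```python
from collections import Counter
from math import log2

def column_entropy(column):
    """Shannon entropy (bits) of the symbols in one alignment column."""
    counts = Counter(column)
    n = len(column)
    return -sum((c / n) * log2(c / n) for c in counts.values())

def cluster_entropy(seqs):
    """Sum of column entropies over a cluster of equal-length aligned sequences."""
    return sum(column_entropy(col) for col in zip(*seqs))

def clustering_entropy(clusters):
    """Size-weighted average entropy of a clustering (lower is more homogeneous)."""
    total = sum(len(c) for c in clusters)
    return sum(len(c) / total * cluster_entropy(c) for c in clusters if c)

# Toy aligned sequences (hypothetical): two variants differing at two sites.
good = [["ACGT", "ACGT"], ["ACTA", "ACTA"]]
bad = [["ACGT", "ACTA"], ["ACGT", "ACTA"]]
print(clustering_entropy(good), clustering_entropy(bad))
```

A clustering procedure built on this score would move sequences between clusters whenever doing so lowers the total, stopping when no move improves it.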
